Fast Data Analytics by Learning
Today, we collect vast amounts of data, and the volume of data we collect is projected to grow faster than computational power. This rapid growth inevitably increases query latencies, and horizontal scaling alone is not sufficient for real-time analytics over big data. Approximate query processing (AQP) speeds up data analytics at the cost of small quality losses in query answers. AQP produces answers from synopses of the original data. Because the synopses are smaller than the original data, AQP requires less computation and can therefore produce answers more quickly. In AQP, there is a general tradeoff between query latency and answer quality: obtaining higher-quality answers requires longer latencies.
In this dissertation, we show that we can speed up approximate query processing without reducing the quality of the query answers by optimizing the synopses in two ways:
1. Exploiting past computations: We exploit the answers to past queries. This approach relies on the fact that if two aggregations involve common or correlated values, their aggregated results must also be correlated. We capture this idea formally with a probability distribution, which is then used to refine the answers to new queries.
2. Building task-aware synopses: By optimizing synopses for a few common types of data analytics, we can produce higher-quality answers (or answers of a target quality more quickly) for those tasks. We use this approach to construct synopses optimized for searching and visualization.
For both approaches, our work incorporates statistical inference and optimization techniques. The contributions in this dissertation resulted in up to 20x speedups for real-world data analytics workloads.
PhD dissertation, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/138598/1/pyongjoo_1.pd
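As a minimal sketch of the first approach (exploiting past computations), refining a new estimate with a correlated past answer can be illustrated as inverse-variance weighting of two Gaussian estimates. The function name and numbers below are illustrative assumptions, not the dissertation's actual model:

```python
# Hypothetical sketch: refining a fresh sample estimate of an aggregate with
# a correlated past answer via Gaussian (inverse-variance) weighting.
# Function name and all numbers are illustrative.

def combine_estimates(past_mean, past_var, sample_mean, sample_var):
    """Combine two independent Gaussian estimates into one with lower variance."""
    w_past = 1.0 / past_var
    w_new = 1.0 / sample_var
    mean = (w_past * past_mean + w_new * sample_mean) / (w_past + w_new)
    var = 1.0 / (w_past + w_new)
    return mean, var

# Suppose a past overlapping query suggested AVG(price) ~ 102 with variance 4,
# and a fresh small sample gives 98 with variance 9.
mean, var = combine_estimates(102.0, 4.0, 98.0, 9.0)
```

The combined variance is smaller than either input's, which is the sense in which past computations let AQP reach a target quality with a smaller (faster) sample.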
AirIndex: Versatile Index Tuning Through Data and Storage
The end-to-end lookup latency of a hierarchical index -- such as a B-tree or
a learned index -- is determined by its structure such as the number of layers,
the kinds of branching functions appearing in each layer, the amount of data we
must fetch from each layer, etc. Our primary observation is that by optimizing
these structural parameters (or designs) specifically for a target system's I/O
characteristics (e.g., latency, bandwidth), we can offer faster lookups than
designs that are not so optimized. Can we develop a systematic method for
finding those optimal design parameters? Ideally, such a method should be able
to generate almost any existing index, or a novel combination of them, for the
fastest possible lookup.
In this work, we present a new data- and I/O-aware index builder (called
AirIndex) that can find high-speed hierarchical index designs in a principled
way. Specifically, AirIndex minimizes an objective function expressing the
end-to-end latency in terms of various designs -- the number of layers, types
of layers, and more -- for given data and a storage profile, using a
graph-based optimization method purpose-built to address the computational
challenges arising from the inter-dependencies among index layers and the
exponentially many candidate parameters in a large search space. Our empirical
studies confirm that AirIndex can find optimal index designs, build optimal
indexes within times comparable to existing methods, and deliver up to 4.1x
faster lookups than a lightweight B-tree library (LMDB), 3.3x--46.3x faster
than state-of-the-art learned indexes (RMI/CDFShop, PGM-Index, ALEX/APEX,
PLEX), and 2.0x faster than Data Calculator's suggestion across various dataset
and storage settings.
Comment: 13 pages, 3 appendices, 19 figures, to appear at SIGMOD 202
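The core intuition, that the best index depth depends on the storage profile, can be sketched with a toy latency model. This is an illustrative assumption-laden sketch (uniform fanout, fixed entry size, hypothetical storage numbers), not AirIndex's actual optimizer:

```python
# Illustrative sketch (not AirIndex's actual algorithm): choose the number of
# index layers that minimizes a modeled end-to-end lookup latency for a given
# storage profile. Assumes a uniform fanout per layer and 16-byte entries.

def lookup_latency(num_keys, layers, rtt_s, bandwidth_bps, entry_bytes=16):
    fanout = num_keys ** (1.0 / layers)       # uniform branching factor
    node_bytes = fanout * entry_bytes         # bytes fetched per layer
    return layers * (rtt_s + node_bytes / bandwidth_bps)

def best_design(num_keys, rtt_s, bandwidth_bps, max_layers=8):
    return min(range(1, max_layers + 1),
               key=lambda k: lookup_latency(num_keys, k, rtt_s, bandwidth_bps))

# Fast local SSD (low round-trip latency) vs. remote object store (high latency):
local = best_design(10**8, rtt_s=100e-6, bandwidth_bps=2e9)
remote = best_design(10**8, rtt_s=50e-3, bandwidth_bps=500e6)
```

Under these made-up numbers the fast storage favors a deeper index with small nodes, while the high-latency store favors fewer, larger layers; the structural parameters fall out of the storage profile rather than being fixed a priori.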
Unified Hierarchical Relationship Between Thermodynamic Tradeoff Relations
Recent years have witnessed a surge of discoveries in the studies of
thermodynamic inequalities: the thermodynamic uncertainty relation (TUR) and
the entropic bound (EB) provide a lower bound on the entropy production (EP) in
terms of nonequilibrium currents; the classical speed limit (CSL) expresses the
lower bound on the EP using the geometry of probability distributions; the
power-efficiency (PE) tradeoff dictates the maximum power achievable for a heat
engine given the level of its thermal efficiency. In this study, we show that
there exists a unified hierarchical structure encompassing all of these bounds,
with the fundamental inequality given by a novel extension of the TUR (XTUR)
that incorporates the most general range of current-like and state-dependent
observables. By selecting more specific observables, the TUR and the EB follow
from the XTUR, and the CSL and the PE tradeoff follow from the EB. Our
derivations cover both Langevin and Markov jump systems, with the first proof
of the EB for the Markov jump systems and a more generalized form of the CSL.
We also present concrete examples of the EB for the Markov jump systems and the
generalized CSL.
Comment: 19 pages, 4 figures
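For concreteness, the standard TUR that the abstract builds on (for a time-integrated current in a nonequilibrium steady state, with Boltzmann's constant set to one) can be written as:

```latex
% Standard thermodynamic uncertainty relation (steady state, k_B = 1):
% the relative fluctuation of a current J is bounded below by the inverse
% of the total entropy production \Sigma over the observation window.
\[
  \frac{\operatorname{Var}(J)}{\langle J \rangle^{2}} \;\ge\; \frac{2}{\Sigma}
\]
```

Equivalently, \(\Sigma \ge 2\langle J\rangle^{2}/\operatorname{Var}(J)\), i.e., current statistics bound the entropy production from below; this is the sense in which the XTUR, by admitting more general observables, sits at the top of the hierarchy described above.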
ElasticNotebook: Enabling Live Migration for Computational Notebooks (Technical Report)
Computational notebooks (e.g., Jupyter, Google Colab) are widely used for
interactive data science and machine learning. In those frameworks, users can
start a session, then execute cells (i.e., a set of statements) to create
variables, train models, visualize results, etc. Unfortunately, existing
notebook systems do not offer live migration: when a notebook launches on a new
machine, it loses its state, preventing users from continuing their tasks from
where they left off. This is because, unlike a DBMS, notebook sessions directly
rely on underlying kernels (e.g., Python/R interpreters) without an additional
data management layer. Existing techniques for preserving states, such as
copying all variables or OS-level checkpointing, are unreliable (often fail),
inefficient, and platform-dependent. Also, re-running code from scratch can be
highly time-consuming. In this paper, we introduce a new notebook system,
ElasticNotebook, that offers live migration via checkpointing/restoration using
a novel mechanism that is reliable, efficient, and platform-independent.
Specifically, by observing all cell executions via transparent, lightweight
monitoring, ElasticNotebook can find a reliable and efficient way (i.e.,
replication plan) for reconstructing the original session state, considering
variable-cell dependencies, observed runtime, variable sizes, etc. To this end,
our new graph-based optimization problem finds how to reconstruct all variables
(efficiently) from a subset of variables that can be transferred across
machines. We show that ElasticNotebook reduces end-to-end migration and
restoration times by 85%-98% and 94%-99%, respectively, on a variety (i.e.,
Kaggle, JWST, and Tutorial) of notebooks with negligible runtime and memory
overheads of <2.5% and <10%.
Comment: Accepted to VLDB 202
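The replication-plan idea can be illustrated with a toy cost model: for each variable, either serialize-and-transfer it or re-run the cell that produced it, with unserializable objects forced onto the re-run path. All variable names and costs below are invented for illustration; this is not ElasticNotebook's actual optimizer:

```python
# Toy sketch of a replication-plan choice (illustrative, not ElasticNotebook's
# real algorithm): pick, per variable, transfer vs. re-run to minimize total
# migration cost, with unserializable variables forced to be re-run.
from itertools import combinations

# var -> (transfer_cost_s, rerun_cost_s, serializable?)  (all values invented)
VARS = {
    "df":    (2.0, 30.0, True),    # large dataframe: costly to recompute
    "model": (8.0, 120.0, True),   # trained model: far cheaper to transfer
    "conn":  (None, 0.5, False),   # DB connection: cannot be serialized
    "plot":  (0.3, 1.0, True),
}

def plan_cost(serialized):
    total = 0.0
    for name, (xfer, rerun, ok) in VARS.items():
        if name in serialized:
            if not ok:
                return float("inf")   # infeasible: unserializable variable
            total += xfer
        else:
            total += rerun
    return total

names = list(VARS)
best = min((frozenset(c) for r in range(len(names) + 1)
            for c in combinations(names, r)), key=plan_cost)
```

The real problem is harder because variables share producing cells and dependencies, which is why a graph-based formulation (rather than this per-variable brute force) is needed.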
Body Mass Index and Decline of Cognitive Function
Background: The association between body mass index (BMI) and cognitive function is a public health issue. This study investigated the relationship between obesity and cognitive impairment, assessed by the Korean version of the Mini-Mental State Examination (K-MMSE), among middle-aged and older people in South Korea.
Methods: A cohort of 5,125 adults, aged 45 or older with normal cognitive function (K-MMSE ≥ 24) at baseline (2006), was derived from the Korean Longitudinal Study of Aging (KLoSA) 2006~2012. The association between baseline BMI and the risk of cognitive impairment was assessed using multiple logistic regression models. We also assessed baseline BMI and change in cognitive function over the 6-year follow-up using multiple linear regressions.
Results: During the follow-up, 358 cases of severe cognitive impairment were identified. Those with baseline BMI ≥ 25 kg/m2, compared with normal-weight participants (18.5 ≤ BMI < 23 kg/m2), were marginally less likely to develop severe cognitive impairment (adjusted odds ratio [aOR] = 0.73, 95% CI = 0.52 to 1.03; Ptrend = 0.03). This relationship was stronger among women (aOR = 0.63, 95% CI = 0.40 to 1.00; Ptrend = 0.01) and participants with low-normal K-MMSE scores (MMSE: 24–26) at baseline (aOR = 0.59, 95% CI = 0.35 to 0.98; Ptrend < 0.01). In addition, a slower decline in cognitive function was observed in obese individuals than in those of normal weight, especially among women and those with low-normal K-MMSE scores at baseline.
Conclusion: In this nationally representative study, we found that obesity was associated with a lower risk of cognitive decline among the middle-aged and older population.
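For readers less familiar with the odds-ratio machinery used above, an unadjusted odds ratio with a Wald 95% confidence interval can be computed from a 2x2 table as follows. The counts here are invented for illustration and are not from the KLoSA cohort (the study's reported aORs additionally adjust for covariates via logistic regression):

```python
# Illustrative only: unadjusted odds ratio and Wald 95% CI from a
# hypothetical 2x2 table (counts invented, not from the KLoSA data).
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """a,b = impaired / not impaired among exposed; c,d = same among unexposed."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)   # SE of log odds ratio
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, (lo, hi)

or_, (lo, hi) = odds_ratio_ci(60, 940, 120, 1380)
```

An odds ratio below 1 with a confidence interval that barely crosses 1 mirrors the "marginally less likely" pattern the abstract reports.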
All-rounder: A flexible DNN accelerator with diverse data format support
Recognizing the explosive increase in the use of DNN-based applications,
several industrial companies developed a custom ASIC (e.g., Google TPU, IBM
RaPiD, Intel NNP-I/NNP-T) and constructed a hyperscale cloud infrastructure
with it. The ASIC performs the inference or training operations of DNN models
requested by users. Since DNN models use different data formats and types of
operations, the ASIC needs to support diverse data formats and general
operations. However, conventional ASICs do not fulfill these requirements. To
overcome these limitations, we propose a
flexible DNN accelerator called All-rounder. The accelerator is designed with
an area-efficient multiplier supporting multiple precisions of integer and
floating-point datatypes. In addition, it incorporates a flexibly fusible and
fissionable MAC array to support various types of DNN operations efficiently.
We implemented the register transfer level (RTL) design using Verilog and
synthesized it in 28nm CMOS technology. To examine practical effectiveness of
our proposed designs, we designed two multiply units and three state-of-the-art
DNN accelerators. We compare our multiplier with the multiply units and perform
architectural evaluation on performance and energy efficiency with eight
real-world DNN models. Furthermore, we compare the benefits of the All-rounder
accelerator against a high-end GPU card, i.e., an NVIDIA GeForce RTX 3090. The
proposed All-rounder accelerator consistently achieves speedups and higher
energy efficiency than the baselines across various DNN benchmarks.
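The idea behind a fusible/fissionable multiplier array can be sketched arithmetically: one wide multiply decomposes into four narrow partial products, so the same narrow units can either fuse into one high-precision multiply or split into independent low-precision ones. This is a generic precision-decomposition identity, not All-rounder's actual circuit:

```python
# Toy sketch of precision decomposition (a common multi-precision multiplier
# trick; not All-rounder's actual hardware design): a 2N-bit multiply built
# from four N-bit partial products, which a "fissionable" MAC array could
# instead use as four independent N-bit multiplies.

def wide_mul_from_narrow(a, b, n=8):
    mask = (1 << n) - 1
    ah, al = a >> n, a & mask          # split operands into high/low halves
    bh, bl = b >> n, b & mask
    # four narrow partial products, shifted and summed:
    # a*b = (ah*bh << 2n) + ((ah*bl + al*bh) << n) + al*bl
    return (ah * bh << 2 * n) + ((ah * bl + al * bh) << n) + al * bl

product = wide_mul_from_narrow(51234, 60001)   # two 16-bit operands, n = 8
```

In hardware, the shifts and adds become a configurable reduction tree, which is what makes the array "fusible" upward and "fissionable" downward across integer precisions.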
Electroless Gold Plating on Aluminum Patterned Chips for CMOS-based Sensor Applications
We present an approach for the activation of aluminum (Al) alloy using palladium (Pd) and the subsequent gold (Au) electroless plating (ELP) for complementary metal-oxide-semiconductor (CMOS)-based sensor applications. In this study, CMOS-process-compatible Al-patterned chips were used as substrates for easy incorporation with existing CMOS circuits. To improve the contact resistance arising from the Schottky barrier between the metal electrodes and the single-walled carbon nanotubes (SWCNTs), electroless deposition of gold, which has a higher work function than Al, was adopted because SWCNTs have p-type semiconductor properties. Each step of the Au ELP procedure was studied under various bath temperatures, immersion times, and chemical concentrations. Fine Pd particles were homogeneously distributed on the Al surface by the Pd activation process at room temperature. Au ELP allowed selective deposition of the Au film on the activated Al surface only. The SWCNT networks formed on the Au-plated chip by a dip-coating method showed improved contact resistance and reduced resistance variation between the Au electrode and the SWCNTs. We also tried SWCNT decoration with Au particles using the Au ELP method described above, which is expected to be applicable in various areas including field-effect transistors and sensor devices.
This work was supported by the Nano Systems Institute-National Core Research Center (NSI-NCRC) program of NRF and the TDPAF, Ministry for Agriculture, Forestry and Fisheries, Republic of Korea.